Hayward
Exponential Family Estimation via Adversarial Dynamics Embedding
Bo Dai, Zhen Liu, Hanjun Dai, Niao He, Arthur Gretton, Le Song, Dale Schuurmans
Theorem 1 (Fencheldualoflog-partition (Wainwrightand Jordan,2008)) Let H(q): = R q(x) logq(x)dx. The C. Compared optimization Goodfello, 2014; Arjovsk, 2017; Dai, 2017), thereversalmin-maxin (20), themajor sharesparameters updatesofthe accelerating learnedadv empirically 5. Similaroptimization(13) with (17).
- Asia > Middle East > Jordan (0.25)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- (6 more...)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom > England (0.05)
- Oceania > New Zealand (0.04)
- (10 more...)
It's all In the (Exponential) Family: An Equivalence between Maximum Likelihood Estimation and Control Variates for Sketching Algorithms
Kang, Keegan, Wang, Kerong, Zhang, Ding, Pratap, Rameshwar, Verma, Bhisham Dev, Wong, Benedict H. W.
Maximum likelihood estimators (MLE) and control variate estimators (CVE) have been used in conjunction with known information across sketching algorithms and applications in machine learning. We prove that under certain conditions in an exponential family, an optimal CVE will achieve the same asymptotic variance as the MLE, giving an Expectation-Maximization (EM) algorithm for the MLE. Experiments show the EM algorithm is faster and numerically stable compared to other root finding algorithms for the MLE for the bivariate Normal distribution, and we expect this to hold across distributions satisfying these conditions. We show how the EM algorithm leads to reproducibility for algorithms using MLE / CVE, and demonstrate how the EM algorithm leads to finding the MLE when the CV weights are known.
- Asia > China > Hong Kong (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- North America > United States > Virginia (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Generative Modeling of Discrete Data Using Geometric Latent Subspaces
Gonzalez-Alvarado, Daniel, Cassel, Jonas, Petra, Stefania, Schnörr, Christoph
We introduce the use of latent subspaces in the exponential parameter space of product manifolds of categorial distributions, as a tool for learning generative models of discrete data. The low-dimensional latent space encodes statistical dependencies and removes redundant degrees of freedom among the categorial variables. We equip the parameter domain with a Riemannian geometry such that the spaces and distances are related by isometries which enables consistent flow matching. In particular, geodesics become straight lines which makes model training by flow matching effective. Empirical results demonstrate that reduced latent dimensions suffice to represent data for generative modeling.
- North America > United States > California > Alameda County > Hayward (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- (2 more...)
A tensor network formalism for neuro-symbolic AI
Goessmann, Alex, Schütte, Janina, Fröhlich, Maximilian, Eigel, Martin
The unification of neural and symbolic approaches to artificial intelligence remains a central open challenge. In this work, we introduce a tensor network formalism, which captures sparsity principles originating in the different approaches in tensor decompositions. In particular, we describe a basis encoding scheme for functions and model neural decompositions as tensor decompositions. The proposed formalism can be applied to represent logical formulas and probability distributions as structured tensor decompositions. This unified treatment identifies tensor network contractions as a fundamental inference class and formulates efficiently scaling reasoning algorithms, originating from probability theory and propositional logic, as contraction message passing schemes. The framework enables the definition and training of hybrid logical and probabilistic models, which we call Hybrid Logic Network. The theoretical concepts are accompanied by the python library tnreason, which enables the implementation and practical use of the proposed architectures.
- Europe > Germany (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (11 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Bayesian Empirical Bayes: Simultaneous Inference from Probabilistic Symmetries
Wu, Bohan, Weinstein, Eli N., Blei, David M.
Empirical Bayes (EB) improves the accuracy of simultaneous inference "by learning from the experience of others" (Efron, 2012). Classical EB theory focuses on latent variables that are iid draws from a fitted prior (Efron, 2019). Modern applications, however, feature complex structure, like arrays, spatial processes, or covariates. How can we apply EB ideas to these settings? We propose a generalized approach to empirical Bayes based on the notion of probabilistic symmetry. Our method pairs a simultaneous inference problem-with an unknown prior-to a symmetry assumption on the joint distribution of the latent variables. Each symmetry implies an ergodic decomposition, which we use to derive a corresponding empirical Bayes method. We call this methodBayesian empirical Bayes (BEB). We show how BEB recovers the classical methods of empirical Bayes, which implicitly assume exchangeability. We then use it to extend EB to other probabilistic symmetries: (i) EB matrix recovery for arrays and graphs; (ii) covariate-assisted EB for conditional data; (iii) EB spatial regression under shift invariance. We develop scalable algorithms based on variational inference and neural networks. In simulations, BEB outperforms existing approaches to denoising arrays and spatial data. On real data, we demonstrate BEB by denoising a cancer gene-expression matrix and analyzing spatial air-quality data from New York City.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > California > Alameda County > Hayward (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (8 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- (2 more...)
Comparing Two Proxy Methods for Causal Identification
Guo, Helen, Ogburn, Elizabeth L., Shpitser, Ilya
Identifying causal effects in the presence of unmeasured variables is a fundamental challenge in causal inference, for which proxy variable methods have emerged as a powerful solution. We contrast two major approaches in this framework: (1) bridge equation methods, which leverage solutions to integral equations to recover causal targets, and (2) array decomposition methods, which recover latent factors composing counterfactual quantities by exploiting unique determination of eigenspaces. We compare the model restrictions underlying these two approaches and provide insight into implications of the underlying assumptions, clarifying the scope of applicability for each method.
- North America > United States > California > Alameda County > Hayward (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Africa > Senegal > Kolda Region > Kolda (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Hungary > Budapest > Budapest (0.04)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- (4 more...)
Theory and computation for structured variational inference
Sheng, Shunan, Wu, Bohan, Zhu, Bennett, Chewi, Sinho, Pooladian, Aram-Alexandre
Structured variational inference constitutes a core methodology in modern statistical applications. Unlike mean-field variational inference, the approximate posterior is assumed to have interdependent structure. We consider the natural setting of star-structured variational inference, where a root variable impacts all the other ones. We prove the first results for existence, uniqueness, and self-consistency of the variational approximation. In turn, we derive quantitative approximation error bounds for the variational approximation to the posterior, extending prior work from the mean-field setting to the star-structured setting. We also develop a gradient-based algorithm with provable guarantees for computing the variational approximation using ideas from optimal transport theory. We explore the implications of our results for Gaussian measures and hierarchical Bayesian models, including generalized linear models with location family priors and spike-and-slab priors with one-dimensional debiasing. As a by-product of our analysis, we develop new stability results for star-separable transport maps which might be of independent interest.
- North America > Canada > Ontario > Toronto (0.14)
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York (0.04)
- (10 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)
Copula-Stein Discrepancy: A Generator-Based Stein Operator for Archimedean Dependence
Aich, Agnideep, Aich, Ashit Baran
Kernel Stein discrepancies (KSDs) have become a principal tool for goodness-of-fit testing, but standard KSDs are often insensitive to higher-order dependency structures, such as tail dependence, which are critical in many scientific and financial domains. We address this gap by introducing the Copula-Stein Discrepancy (CSD), a novel class of discrepancies tailored to the geometry of statistical dependence. By defining a Stein operator directly on the copula density, CSD leverages the generative structure of dependence, rather than relying on the joint density's score function. For the broad class of Archimedean copulas, this approach yields a closed-form Stein kernel derived from the scalar generator function. We provide a comprehensive theoretical analysis, proving that CSD (i) metrizes weak convergence of copula distributions, ensuring it detects any mismatch in dependence; (ii) has an empirical estimator that converges at the minimax optimal rate of $O_P(n^{-1/2})$; and (iii) is provably sensitive to differences in tail dependence coefficients. The framework is extended to general non-Archimedean copulas, including elliptical and vine copulas. Computationally, the exact CSD kernel evaluation scales linearly in dimension, while a novel random feature approximation reduces the $n$-dependence from quadratic $O(n^2)$ to near-linear $\tilde{O}(n)$, making CSD a practical and theoretically principled tool for dependence-aware inference.
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
- North America > United States > Louisiana > Lafayette Parish > Lafayette (0.04)
- North America > United States > California > Alameda County > Hayward (0.04)
- (2 more...)